Character recognizer

ABSTRACT

A character recognizing apparatus capable of discriminatively recognizing characters that are written continuously, irregularly and/or with modification is provided. The apparatus includes an input unit for allowing a handwritten character to be inputted to thereby output a coordinate-points string, a dictionary for storing therein a plurality of character codes and character patterns corresponding to the character codes, respectively, an element decomposition module for decomposing the coordinate-points string outputted from the input unit into a plurality of elements which constitute the character, and a matching module for determining, for each of the character patterns stored in the dictionary, distance values between corresponding elements of the character pattern stored in the dictionary and elements of the character pattern of the inputted character, and for correcting the distance values so determined on the basis of the elements bearing no correspondence.

TECHNICAL FIELD

The present invention relates to a handwritten character recognizing apparatus for recognizing handwritten letters or characters on an online basis.

BACKGROUND ART

A technique for recognizing characters such as simplified characters and cursive characters is described in JP-A-2-56689. More specifically, straight lines extending in one direction are extracted from strings of coordinate points which constitute a character. The straight lines extracted are sorted for selecting out a straight line of a large length as a substroke S1. Subsequently, the other line segments than the selected one are set as substrokes S2, whereon recognition is performed by making decision as to presence of a corresponding character in a dictionary on the basis of positions and shapes of the substrokes S1 and S2.

With the conventional technique mentioned above, character recognition is performed on the basis of shapes and dispositions of strokes represented by the coordinates strings which constitute a character pattern. However, the conventional technique suffers problems such as mentioned below because the recognition is performed on the basis of all the coordinates strings.

In general, incapability of recognizing the cursive characters and the simplified characters can be ascribed to the difference between the character patterns inputted and the character patterns stored in the dictionary. When a character is written cursively or continuously, extraneous elements or portions may be inputted, as a result of which the inputted character may present a shape which is utterly different from that of a relevant character pattern stored in the dictionary. Consequently, in order to make it possible to recognize the cursively or continuously written character, it is necessary to discriminatively determine which portions of the inputted character pattern are required for the character recognition and which portions are not required for the recognition. Thus, difficulty is encountered in recognizing properly or satisfactorily the cursively written characters even when the recognition is performed on the basis of all the coordinates.

With a view to solving the problem mentioned above, it is an object of the present invention to provide a character recognizing apparatus which is capable of recognizing even the cursive characters written roughly in incorrect or irregular order and containing modifications and which apparatus can lessen the load involved in the processing.

DISCLOSURE OF INVENTION

For achieving the object mentioned above, the character recognizing apparatus according to the present invention is characterized in that it includes an input unit for allowing a handwritten character to be inputted to thereby output a coordinate-points string or strings, a dictionary for storing therein a plurality of character codes and character patterns corresponding to the character codes, respectively, an element decomposition module for decomposing the coordinates string outputted from the input unit into a plurality of elements which constitute the character, a matching module for determining corresponding distance values for elements of a character pattern stored in the dictionary and elements of the inputted character pattern for each of character patterns stored in the dictionary and correcting the determined distance values on the basis of the elements which bear no correspondency, and a processing unit for displaying on a display unit the character pattern for which the distance values are determined small.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a view showing a configuration of a system according to the present invention.

FIG. 2 is a view showing a conventional scheme of character recognition.

FIG. 3 is a view showing a scheme of character recognition according to the present invention.

FIG. 4 is a view showing processings as a whole which are executed in the system according to the present invention.

FIG. 5 is a view showing a cuneiform approximation processing in FIG. 4.

FIGS. 6A and 6B are views showing a vertical/horizontal element decomposition processing in FIG. 5.

FIGS. 7A to 7C are views showing an input/dictionary matching processing in FIG. 4.

FIGS. 8A and 8B are views showing a feature-ANDing distance value arithmetic processing in FIGS. 7A to 7C.

FIG. 9 is a view showing a detail discrimination processing.

FIGS. 10A and 10B are views showing a stroke or delineation insufficiency check processing.

FIG. 11 is a view showing a link check processing.

FIG. 12 is a view showing a writing direction check processing.

FIGS. 13A to 13C are views for illustrating a processing scheme.

FIG. 14 is a view for illustrating a processing scheme.

FIGS. 15A to 15F are views for illustrating a processing scheme.

BEST MODES FOR CARRYING OUT THE INVENTION

In the following, description will be made of an online-type handwritten character input apparatus according to the present invention by reference to the drawings.

FIG. 1 shows an online-type handwritten character input apparatus according to the present invention. A liquid crystal tablet 110 is composed of an input field and a display field. When the user inputs a character on the liquid crystal tablet by handwriting, the character inputted is detected to be transferred to a pen manager 120 as a time-serial coordinate-points string. Upon reception of the time-serial coordinate-points string from the liquid crystal tablet 110, the pen manager 120 transfers the coordinate-points string to a preprocessing module 140 when the coordinate-points string is decided as being destined for the character recognition in the light of field attributes of the liquid crystal tablet 110 which are defined by an application 130. To this end, the application 130 is designed for executing an application program for dividing a display area on the liquid crystal tablet 110 into several fields while defining the field attributes thereof. By way of example, a part of the image screen of the liquid crystal tablet may be defined as the character input field. The preprocessing module 140 is designed to receive the time-serial coordinate-points string from the pen manager 120 to perform a sampling processing on the coordinate-points string having a thin portion in correspondence to a high writing speed and a dense portion corresponding to a low writing speed for thereby making uniform the density of the coordinate-points string and additionally perform normalization with regard to position and size. The coordinate-points string normalized in respect to the position and the size and having the density uniformized in this way is inputted to a cuneiform approximating module 150.

The cuneiform approximating module 150 is composed of an element decomposition module 151 and an element permutating module 152.

The element decomposition module 151 is designed to generate line segments from the coordinate-points string having the density uniformized by the preprocessing module 140 to thereby execute a pattern matching processing. The line segments as generated are transferred to the element permutating module 152. In the element permutating module 152, the line segments are arrayed such that line-segment elements extending upwardly or downwardly are arrayed orderly from the topmost with the line-segment elements extending rightwards or leftwards being arrayed orderly from the leftmost in dependence on the positions. In that case, the line-segment elements are arrayed orderly in dependence on the length thereof so that correspondence to the dictionary can be established on the basis of the positions of the strokes even for the character written in an incorrect or irregular stroke sequence. The result of this orderly positioning process is held as it is, while the line-segment element string and the line-segment positional order data are transferred to a matching module 170.
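The positional ordering performed by the element permutating module 152 may be pictured with the following minimal Python sketch. The names `Segment` and `permute_elements` are hypothetical and do not appear in the embodiment; the rule used to tell a vertical element from a horizontal one (the larger coordinate extent) and the assumption that the y axis grows downwards are likewise illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A line-segment element given by its start (xs, ys) and end (xe, ye)."""
    xs: float
    ys: float
    xe: float
    ye: float

    def is_vertical(self) -> bool:
        # Assumption: an element counts as vertical when its y extent
        # is at least as large as its x extent.
        return abs(self.ye - self.ys) >= abs(self.xe - self.xs)


def permute_elements(segments):
    """Return (vertical, horizontal) lists ordered by position.

    Vertical elements are ordered from the topmost and horizontal
    elements from the leftmost, independently of the writing order.
    The y axis is assumed to grow downwards.
    """
    vertical = [s for s in segments if s.is_vertical()]
    horizontal = [s for s in segments if not s.is_vertical()]
    vertical.sort(key=lambda s: min(s.ys, s.ye))
    horizontal.sort(key=lambda s: min(s.xs, s.xe))
    return vertical, horizontal
```

Because the order depends only on position, two inputs of the same character written in different stroke sequences yield the same ordered lists in this sketch.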

The matching module 170 is designed to carry out a dictionary matching process on the basis of the line-segment element series and the line-segment positional order data sent from the cuneiform approximating module 150 to thereby extract from the dictionary a character which approximates the character pattern inputted by handwriting by way of the liquid crystal tablet 110, whereon the character extracted is displayed in the display field on the liquid crystal tablet 110. Parenthetically, the dictionary 160 holds therein line-segment element series obtained from the coordinate-points strings of character patterns through cooperation of the preprocessing module 140 and the cuneiform approximating module 150 together with character codes while establishing correspondences between the line-segment element series and the character codes.

The matching module 170 is comprised of a matching manage module 171, an AND processing module 172 for determining distance values between the line-segment element series of the inputted character pattern and character patterns stored in the dictionary, a dictionary distance value correcting module 173 for extracting the line-segment elements existing only in the dictionary for thereby correcting the distance value, an input distance value correcting module 174 for extracting the line-segment element(s) existing only in the input pattern for thereby correcting the distance value, and a recognition result output module 175 for extracting the result of recognition on the basis of the distance values obtained from the AND processing module 172, the dictionary distance value correcting module 173 and the input distance value correcting module 174 for thereby displaying the result of recognition in the display field on the liquid crystal tablet 110.

The matching manage module 171 is designed to read out the character patterns stored in the dictionary 160 one by one therefrom for thereby allowing the distance value relative to the inputted character pattern to be subsequently computed by the AND processing module 172, the dictionary distance value correcting module 173 and the input distance value correcting module 174. More specifically, upon reception of the line-segment element series of the inputted character pattern and the dictionary-stored character pattern from the cuneiform approximating module 150 and the dictionary 160, respectively, the matching manage module 171 transfers the received data to the AND processing module 172, the dictionary distance value correcting module 173 and the input distance value correcting module 174 for thereby allowing the distance value between the inputted character pattern and the dictionary-stored character pattern to be arithmetically determined while correcting the distance value, whereby identifier or ID of the dictionary-stored character pattern and the relevant distance value are transferred to the recognition result output module 175.

The AND processing module 172 is designed such that, upon comparison of the line-segment element series of the dictionary pattern with the line-segment element series of the input character pattern separately for the vertical line-segment elements and the horizontal line-segment elements, the AND processing module establishes correspondences between the line-segment elements of the character pattern of the dictionary and the line-segment elements of the inputted character pattern which are located closely in respect to the position (i.e., in view of the positional order resulting from the permutation performed by the element permutating module 152) to thereby classify three different types of cases, i.e., (1) the case where the element or elements corresponding to the inputted character pattern are found in the character pattern stored in the dictionary, (2) the case where the line-segment element or elements corresponding to the inputted character pattern are not found in any character pattern stored in the dictionary, and (3) the case where the line-segment element or elements corresponding to the character pattern stored in the dictionary are not found in the inputted character pattern, respectively, whereon the distance values between the line-segment elements are determined with a total sum of these distance values being set as the ultimate distance value. In the cases (2) and (3) mentioned above, the distance values are determined in detail by the dictionary distance value correcting module 173 and the input distance value correcting module 174. Thus, for the present, the distance values are presumed to be constant in the cases (2) and (3) mentioned above. The result of the processing, i.e., the correspondences established between the line-segment elements and the distance values determined between the line-segment elements, are transferred to the dictionary distance value correcting module 173 by way of the matching manage module 171. At this juncture, it should be mentioned that when the distance value becomes excessively large, the processing performed for the relevant dictionary-stored pattern may be terminated with a message to such effect being sent to the matching manage module 171.

The dictionary distance value correcting module 173 is provided with a view to taking into consideration modifications of the inputted character pattern. When modification and/or cursive writing is so serious that omission of some portion is incurred, there may arise such situation that the inputted character pattern lacks the line-segment element(s) which corresponds to the line-segment element(s) of the character pattern stored in the dictionary. To cope with such situation, for the input character pattern detected as lacking the line-segment element which corresponds to the line-segment element of the dictionary character pattern as the result of the AND processing performed by the module 172, (1) a search is performed for a line-segment element other than those of the inputted character pattern to thereby determine the distance value for the searched line-segment element, if found, and (2) if otherwise, the distance value conforming to the size of the relevant line-segment element of the dictionary character pattern is set, whereon the distance value determined in this way is set instead of the predetermined constant value set by the AND processing module 172. The reason why the distance value conforming to the size is employed can be explained by the fact that a large line-segment element is scarcely omitted while a small line-segment element is likely to be omitted. The results of the processing, i.e., the correspondences established between the line-segment elements and the distance values determined between the line-segment elements, are transferred to the input distance value correcting module 174 by way of the matching manage module 171.

The input distance value correcting module 174 is also provided with a view to taking into account the cursive writing of the inputted character. When a cursively or continuously written character is inputted, the line-segment element which corresponds to the continuously written portion of the inputted character pattern may not be found in the dictionary-stored character pattern. In that case, when line-segment elements for which correspondence with the dictionary-stored character pattern can be established exist in the inputted character pattern in precedence and succession to the line-segment element thereof for which correspondence can not be established, as viewed along the series of line-segment elements of the character pattern arrayed in the writing order, then the first-mentioned line-segment elements may duly be interpreted as being written in continuation. Then, for the line-segment elements capable of being interpreted as the continuously written portion, a small distance value is assigned. If otherwise, a large distance value is assigned. Subsequently, the predetermined constant value set by the AND processing module 172 is replaced by the distance value mentioned just above. The correspondences between the line-segment elements and the distance values between the line-segment elements obtained as the result of the processing described above are transferred to the matching manage module 171.

Upon reception of the ID of the dictionary-stored character pattern or character code and the relevant distance values from the matching manage module 171, the recognition result output module 175 selects small distance values to array them in the value-based order, which represents the result of the recognition processing. The result of recognition is once transferred to the detail discrimination module 180, and upon reception of the result of recognition sent back from the detail discrimination module, the recognition result output module outputs it to the pen manager 120.

The detail discrimination module 180 is provided with a view to taking into account even such character features which disappear in the series of the line-segment elements as outputted from the element decomposition module 151 to thereby permutate orderly the result of recognition by taking into consideration the above-mentioned character features as well. More specifically, upon reception of the ID or character code of the dictionary character pattern to which a small distance value is assigned and the relevant distance values thereof from the recognition result output module 175 as the result of recognition processing, the detail discrimination module 180 checks the input pattern with regard to the detail features of the dictionary pattern, to thereby permutate orderly the result of recognition, as occasion requires, which is then transferred to the recognition result output module 175.

In the following, the instant embodiment will be described in conjunction with operation procedure.

FIG. 4 illustrates a flow of processings as a whole executed by the online-type handwritten character input apparatus according to the present invention.

In a processing 401, a character pattern inputted by the user in handwriting is fetched as a time-serial coordinate-points string through cooperation of the liquid crystal tablet 110, the pen manager 120 and the application 130. In processing steps 402 and 403, a sampling processing is performed on the coordinate-points string which may have thin portions corresponding to high writing speed and dense portions corresponding to low writing speed, for thereby making uniform the density of the coordinate-points string while executing normalization with regard to the position and the size by the preprocessing module 140. For normalization of the coordinate-points string with regard to the position and size thereof, a centroid of the character pattern, by way of example, may be determined from the coordinate-points string of the inputted character pattern, whereon the position of the character pattern is normalized by translating it so that the centroid overlaps with the origin, which is then followed by normalization of the size by magnifying or contracting the pattern so that the mean value of the distances to the individual coordinate points from the centroid (=origin) assumes a constant value.
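The density equalization and the centroid-based normalization described above may be sketched as follows. This is an illustrative Python sketch only: the function names `resample` and `normalize`, the fixed resampling step and the target mean radius of 1.0 are assumptions that are not specified in the embodiment.

```python
import math


def resample(points, step):
    """Resample a stroke so that consecutive points lie `step` apart
    along the stroke, which equalizes the point density regardless of
    the writing speed."""
    if not points:
        return []
    out = [points[0]]
    carried = 0.0  # arc length travelled since the last emitted point
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0.0:
            continue
        pos = step - carried  # distance along this piece to the next sample
        while pos <= seg:
            t = pos / seg
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            pos += step
        carried = seg - (pos - step)
    return out


def normalize(points, target_mean_radius=1.0):
    """Translate the centroid to the origin, then scale so that the mean
    distance of the points from the origin equals `target_mean_radius`."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    shifted = [(x - cx, y - cy) for x, y in points]
    mean_r = sum(math.hypot(x, y) for x, y in shifted) / n
    scale = target_mean_radius / mean_r if mean_r > 0 else 1.0
    return [(x * scale, y * scale) for x, y in shifted]
```

In this sketch the two functions would be applied in sequence, for instance `normalize(resample(stroke, step))`, so that the centroid is computed from points of uniform density.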

Next, in a processing 404, a cuneiform approximation processing is carried out by the cuneiform approximating module 150.

In the cuneiform approximation processing, a vertical/horizontal element decomposition processing 501 for decomposing the input pattern into vertical/horizontal line-segment elements is performed, which is then followed by execution of an element permutating processing 502 for permutating the line segments decomposed into the vertical/horizontal line-segment elements, as is illustrated in FIG. 5.

FIG. 6A shows a processing flow for the vertical/horizontal element decomposition processing 501.

Through a processing 601, the line segments are determined by extracting the points which assume local MIN/MAX values as well as the start/terminal points in the x-axis and y-axis directions from the inputted character pattern. More specifically, referring to FIG. 15A, when the coordinate-points string is traced, starting from the start point a inputted by the user, the y-axis coordinate assumes a minimum value at a point b. Subsequently, when the coordinate-points string is traced, starting from the point b, the y-axis coordinate value becomes maximum at a point c (FIG. 15B). The result of determination of the local MIN/MAX values in the x-axis and y-axis directions in this manner will be such as illustrated in FIG. 15C.

However, the pattern obtained by interconnecting the points of the local MIN/MAX values in the x-axis and y-axis directions differs distinctly from the character pattern inputted by the user, as can be seen from FIG. 15D. For this reason, the processing for making the character pattern shown in FIG. 15C approximate closely to the character pattern inputted by the user is performed through processings 602 to 605.

In the processings 602 to 605, decision is made that the segment-based approximation is insufficient when the ratio between length of the line segment and that of the coordinate-points string representing the original stroke is decided to be smaller than a predetermined threshold value a for every line segment constituted by the points as determined. In that case, interpolation is performed by using the midpoint of the original stroke as the approximate point. FIG. 15E shows that since the ratio of length between the line segment 1 and the original stroke 2 is smaller than the predetermined threshold value, the approximate point d is interpolated. A result of such interpolation is illustrated in FIG. 15F.

Finally, in the processing 606, the points determined through the processings 601 to 605 are interconnected to thereby determine a line-segment series.
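The decomposition of the processings 601 to 606 may be illustrated by the following Python sketch. It is a simplified illustration under stated assumptions: the names `extremum_indices` and `approximate` and the default threshold of 0.9 are hypothetical, plateaus in which a coordinate stays constant are not treated as extrema, and the midpoint of the original stroke is taken as the middle sample of the sub-stroke, which presumes a uniformly sampled coordinate-points string.

```python
import math


def extremum_indices(points):
    """Indices of the start point, the terminal point and every point at
    which the x or y coordinate passes a local minimum or maximum."""
    keep = {0, len(points) - 1}
    for k in range(1, len(points) - 1):
        for axis in (0, 1):
            prev_d = points[k][axis] - points[k - 1][axis]
            next_d = points[k + 1][axis] - points[k][axis]
            if prev_d * next_d < 0:  # direction reverses along this axis
                keep.add(k)
    return sorted(keep)


def path_length(points, i, j):
    """Length of the original stroke between sample i and sample j."""
    return sum(math.dist(points[k], points[k + 1]) for k in range(i, j))


def approximate(points, ratio_threshold=0.9):
    """Return the approximating points of one stroke.

    A midpoint of the original stroke is interpolated wherever the ratio
    of the line-segment length to the stroke length falls below
    `ratio_threshold`.
    """
    idx = extremum_indices(points)
    result = [idx[0]]
    for i, j in zip(idx, idx[1:]):
        chord = math.dist(points[i], points[j])
        arc = path_length(points, i, j)
        if j - i > 1 and arc > 0 and chord / arc < ratio_threshold:
            result.append((i + j) // 2)  # assumed midpoint of the stroke
        result.append(j)
    return [points[k] for k in result]
```

For an L-shaped stroke that contains no local extremum, the sketch interpolates the corner, so that the stroke is approximated by one vertical and one horizontal line segment instead of a single diagonal.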

In this conjunction, the processing for interpolating the approximate points may be executed such that the area enclosed by the line-segment element and the original stroke is determined and then the approximate points are interpolated when the area as determined is greater than a predetermined threshold value, as illustrated in FIG. 6B.

The line-segment series determined in this way are permutated through the processing step 502 shown in FIG. 5, whereon matching of the input with the dictionary is performed through the processing step 405 shown in FIG. 4.

Details of the processing for matching the input pattern with the dictionary-stored pattern are illustrated in FIG. 7A.

With this processing, it is presumed that the distance values between a line-segment series and all the dictionary patterns are determined. In a processing 701, distance value for the elements for which coincidence is found between the coordinate-points string and the dictionary pattern is arithmetically determined. By way of example, for a character pattern indicated by an input cuneiform in FIG. 7A and a character pattern indicated by a dictionary cuneiform, thick solid line portions shown in FIG. 7B represent ANDed portions for which the two patterns coincide with each other. Accordingly, the distance values for these portions are arithmetically determined.

A processing step 701 for determining arithmetically the distance value by logically ANDing these feature elements is illustrated in detail in FIGS. 8A and 8B.

In a processing 801, “dist_cpl”, “dist_i_sng” and “dist_d_sng” are initialized. In this conjunction, “dist_cpl” represents a variable for holding the distance values for the line-segment elements of the input pattern and the elements of the dictionary pattern for which correspondence can be established. More concretely, in the case of the example illustrated in FIG. 7B, this variable represents the distance values for the portions indicated by thick solid lines. On the other hand, “dist_i_sng” represents a variable for holding the distance values for the elements of the input pattern for which correspondence to the elements of the dictionary pattern can not be established. In the case of the example illustrated in FIG. 7B, this corresponds to portions of the input pattern indicated by thin solid lines. Furthermore, “dist_d_sng” represents a variable for holding the distance values for the elements of the dictionary pattern for which correspondence to the elements of the input pattern can not be established.

Subsequently, “cpl_i(i)” is initialized to “−1” through processings 802 to 804. In this conjunction, “cpl_i(i)” represents a variable for holding the element ID number j of the element of the dictionary pattern for which correspondence to the line-segment element i of the input pattern can be established. The elements are classified into four direction groups “↓”, “→”, “↑”, and “←” in dependence on the writing directions of the elements. By virtue of such classification, the element j of a dictionary pattern for which correspondence to the element i of the input pattern is to be established can be searched speedily from the same classified group in the succeeding processing 810. Incidentally, such classification may be omitted.

Subsequently, the elements as classified are sorted in accordance with the line segment length in a processing 805. Owing to such sort processing, the element j of the dictionary pattern for which correspondence is to be established relative to the element i of the input pattern in the processing 810 can be detected speedily by performing the search in the sorted order. Incidentally, such sorting may be omitted.

In succession, through the processings 806 to 808, “cpl_d(j)” is initialized to “−1” and the elements are classified into four direction groups “↓”, “→”, “↑”, and “←” on a writing-direction basis similarly to the processings 802 to 804.

Subsequently, in the processings 810 to 814, matching is performed. At first, in the processing 810 for searching the dictionary pattern element j to which correspondence to the element i of the input pattern is to be established, the element j of the dictionary pattern which is classified into the same writing direction as the element i of the input pattern and which satisfies the condition that cpl_d(j)=−1, i.e., the element for which the distance value is smallest among the elements for which correspondence to any one of the input elements has not been established yet, is searched. To this end, the distance value can be arithmetically determined in accordance with, for example, the undermentioned expression for the line segment for which the starting point of the input pattern element i is given by $(x_{is}, y_{is})$ with the terminal point thereof being given by $(x_{ie}, y_{ie})$ and for which the starting point of the dictionary pattern element j is given by $(x_{js}, y_{js})$ with the terminal point thereof being given by $(x_{je}, y_{je})$.

$$
\begin{aligned}
\text{Distance value}(i, j) = {} & a * \bigl( |x_{is} - x_{js}| + |y_{is} - y_{js}| + |x_{ie} - x_{je}| + |y_{ie} - y_{je}| \bigr) \\
& + b * \bigl( |(x_{ie} - x_{is}) - (x_{je} - x_{js})| + |(y_{ie} - y_{is}) - (y_{je} - y_{js})| \bigr)
\end{aligned}
$$

In the above expression, the first term is for determining the difference between the position of the line segment i of the inputted character pattern and the position of the line segment j of the dictionary-stored character pattern and is indispensably required for the character recognition without resorting to the use of information concerning the writing order. If otherwise, it will be impossible to recognize such input pattern as “” illustrated in FIGS. 13A to 13C.
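The distance value between one input element and one dictionary element may be computed as in the following Python sketch. It assumes that each term of the expression is an absolute difference, that an element is passed as a tuple (start x, start y, terminal x, terminal y), and that the coefficients a and b default to 1.0; the function name `distance_value` and these defaults are illustrative assumptions only.

```python
def distance_value(inp, dic, a=1.0, b=1.0):
    """Distance value between input element i and dictionary element j.

    Each element is a tuple (xs, ys, xe, ye). The first term compares
    the positions of the end points; the second term compares the
    displacement from start point to terminal point.
    """
    xis, yis, xie, yie = inp
    xjs, yjs, xje, yje = dic
    position = (abs(xis - xjs) + abs(yis - yjs)
                + abs(xie - xje) + abs(yie - yje))
    shape = (abs((xie - xis) - (xje - xjs))
             + abs((yie - yis) - (yje - yjs)))
    return a * position + b * shape
```

Two elements of identical shape written at different positions differ only in the first term, which is the term that allows correspondence to be established by position rather than by writing order.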

In the processing 811, the distance value (i, j) obtained through the processing 810 is compared with a threshold value to thereby decide whether the correspondence established for (i, j) is correct or not. This processing 811 is effective for preventing elements of noise components inputted due to unintentional stroke or the like from being erroneously recognized as having a significant distance value with correspondence being established, as is illustrated in FIG. 14. When it is decided in the processing 811 that the correspondence as established is correct, then “cpl_i(i)” and “cpl_d(j)” are set to the respective element ID numbers j and i, and the distance value (i, j) is added to “dist_cpl” in the processing 812. On the other hand, when it is decided through the processing 813 that the correspondence as established is incorrect, then the length of the element i is added to “dist_i_sng” as the distance value indicating that correspondence can not be established for the input pattern element i in the processing 814. To this end, a given constant may be employed in place of the length or alternatively a function having the length as a parameter may be employed.

Use of the function having the length as a parameter is effective not only because the distance value can be suppressed for such noise element as illustrated in FIG. 14 but because a large distance value can be assigned in the case where correspondence can not be established for the line segment which occupies a large proportion of the character.

Subsequently, through the processings 815 to 817, “cpl_d(j)” is checked for all the line segments “j=0 to J” of the dictionary-stored pattern, and for the line segments whose value is “−1”, i.e., those for which correspondence with the line segment(s) of the inputted character pattern can not be established, the distance value is added to “dist_d_sng” in the processing 817 as in the case of the processing 814.

In this manner, the distance values are calculated on the basis of the correspondences established between the features of the inputted character pattern and those of the dictionary-stored character pattern.
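The flow of the processings 801 to 817 may be summarized by the following Python sketch. It is a simplified illustration and not the flow of FIGS. 8A and 8B itself: the sorting by length is left out, the y axis is assumed to grow downwards, the length of an element is used as its penalty, the distance expression is assumed to consist of absolute differences with a = b = 1.0, and the helper names `direction`, `length` and `and_matching` are hypothetical.

```python
import math


def direction(seg):
    """Classify an element (xs, ys, xe, ye) into one of four groups."""
    xs, ys, xe, ye = seg
    dx, dy = xe - xs, ye - ys
    if abs(dy) >= abs(dx):
        return "down" if dy >= 0 else "up"
    return "right" if dx >= 0 else "left"


def length(seg):
    xs, ys, xe, ye = seg
    return math.hypot(xe - xs, ye - ys)


def distance_value(inp, dic, a=1.0, b=1.0):
    xis, yis, xie, yie = inp
    xjs, yjs, xje, yje = dic
    position = (abs(xis - xjs) + abs(yis - yjs)
                + abs(xie - xje) + abs(yie - yje))
    shape = (abs((xie - xis) - (xje - xjs))
             + abs((yie - yis) - (yje - yjs)))
    return a * position + b * shape


def and_matching(input_segs, dict_segs, threshold):
    """Return (cpl_i, cpl_d, dist_cpl, dist_i_sng, dist_d_sng)."""
    cpl_i = [-1] * len(input_segs)   # dictionary element matched to input i
    cpl_d = [-1] * len(dict_segs)    # input element matched to dictionary j
    dist_cpl = dist_i_sng = dist_d_sng = 0.0
    for i, seg_i in enumerate(input_segs):
        # Unmatched dictionary elements of the same writing direction.
        candidates = [(distance_value(seg_i, seg_j), j)
                      for j, seg_j in enumerate(dict_segs)
                      if cpl_d[j] == -1
                      and direction(seg_j) == direction(seg_i)]
        best = min(candidates, default=None)
        if best is not None and best[0] <= threshold:
            d, j = best
            cpl_i[i], cpl_d[j] = j, i
            dist_cpl += d
        else:
            dist_i_sng += length(seg_i)  # input element left unmatched
    for j, seg_j in enumerate(dict_segs):
        if cpl_d[j] == -1:
            dist_d_sng += length(seg_j)  # dictionary element left unmatched
    return cpl_i, cpl_d, dist_cpl, dist_i_sng, dist_d_sng
```

In this sketch a short noise element either fails the threshold test or finds no free candidate, and it then contributes only its own small length to “dist_i_sng”.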

Subsequently, in the processing 702 shown in FIG. 7A, the line segment search is performed only for the dictionary to thereby calculate again the distance value. In the case of the example illustrated in FIG. 7C, there remain no line segments of the character pattern of the dictionary for which correspondences to the line segments of the inputted character pattern could not be established. However, when such a line segment is found, then the processing similar to the processing 703 may be carried out as described below.

In the processing 703, search for the feature elements is performed only for the inputted character pattern to thereby compute renewedly the distance values. By way of example, for the character pattern inputted and the character pattern stored in the dictionary such as illustrated in FIG. 7C, portions 12, 14, etc. indicated by thick solid lines of the feature elements of the input pattern represented by thick line strokes remain in the state where correspondences to the elements of the dictionary-stored pattern have not been established yet. More specifically, the portion 12 is inputted intermediately between the portions 11 and 13 which bear correspondence to the elements of the dictionary-stored pattern and thus can be interpreted as a portion generated by writing continuously the elements of the dictionary pattern for which correspondences to the portions 11 and 13 can be established. Accordingly, by imparting or assigning a smaller value to the portion 12 than in the case where the interpretation is impertinent, the cursive character can easily be recognized as well.
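The correction applied to the unmatched input elements in the processing 703 may be sketched in Python as follows. The sketch assumes that the elements are indexed in the writing order, that an unmatched element is interpreted as a continuously written portion when the elements immediately preceding and succeeding it are both matched, and that the small and large values are obtained by weighting the element length; the function name `input_correction` and the weights 0.2 and 1.0 are hypothetical.

```python
def input_correction(cpl_i, lengths, small=0.2, large=1.0):
    """Recompute the distance value for unmatched input elements.

    cpl_i[i] is the matched dictionary element of input element i, or
    -1 when no correspondence was established. `lengths[i]` is the
    length of input element i. Elements are in the writing order.
    """
    total = 0.0
    n = len(cpl_i)
    for i in range(n):
        if cpl_i[i] != -1:
            continue  # matched elements are not corrected here
        # Interpreted as a linking portion when both neighbours matched.
        linked = (0 < i < n - 1
                  and cpl_i[i - 1] != -1
                  and cpl_i[i + 1] != -1)
        total += (small if linked else large) * lengths[i]
    return total
```

With `cpl_i = [0, -1, 1]`, the middle element plays the role of the portion 12 lying between the portions 11 and 13 and is charged only the small weight.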

In this manner, the detail discrimination in the processing 406 shown in FIG. 4 can be performed on the basis of the distance values determined between the input pattern and the dictionary pattern.

More specifically, since the strokes of the character are classified into vertical bars extending upwardly/downwardly and horizontal bars extending leftwardly/rightwardly in the processing 404, the features concerning curving or bending portions of the pattern, such as features which enable distinction between a curve and an angular corner, may undesirably disappear. Accordingly, for this processing 406, a detail discrimination dictionary is provided separately from the dictionary 160, and when the character approximated to a similar shape through the cuneiform approximation is not contained in the result of recognition, the detail discrimination dictionary is referenced for checking the inputted character pattern with regard to the detailed features of the dictionary character pattern, whereupon the results of recognition are permutated if required.

A processing flow of this detail discrimination processing 406 is illustrated in FIG. 9.

The detail discrimination is performed for the first to N-th dictionary character patterns in the ascending order of the distance values obtained through the inputted/dictionary character pattern matching processing 405. At first, for the first to N-th dictionary character patterns, a delineation insufficiency check 901, a link check 902, a writing direction check 903 and an angle/curve check 904 are performed in the ascending order of the distance values through the processings 901 to 905.

FIGS. 10A and 10B show a processing flow of the stroke or delineation insufficiency check 901. With this processing, a penalty of a large value is added to the distance value in the case of stroke insufficiency, even for a small element such as the voiced sound symbol shown in FIG. 1A. At first, in a processing step 1001, the distance value “dist” is initialized to “0”. Further, in the processing 1002, link information “link [ ][ ]” is initialized to “−1”.

Subsequently, through processings 1003 to 1008, for all of the broken-line combinations of all the elements for which mutual coincidence is found as to the end point in the processings 1003 to 1007, i.e., for all the linked line segments, link information is placed in “link [ ][ ]” in processing 1004 or 1006. More specifically, the ID number of a line segment having a terminal point coinciding with the start point of a line segment i is placed in “link [i][0]”, while the ID number of a line segment having a start point coinciding with the terminal point of the line segment i is placed in “link [i][1]”. Then, the elements inputted as part of a same stroke can be traced as a single continuation by referencing “link [i][1]”, wherein the line segment at the start point of the stroke has the value “link [i][0]=−1”.
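The construction of the link information may be sketched as below. The representation of a line segment as a pair of end points, the exact-equality test for coinciding points and the helper names are assumptions made for this illustration; the guard against revisiting a segment is likewise an addition so that a closed stroke does not loop forever.

```python
def build_links(segments):
    """Build link[i] = [predecessor ID, successor ID] for every segment.

    segments[i] is ((x_start, y_start), (x_end, y_end)).
    link[i][0]: ID of the segment whose end point equals the start of i.
    link[i][1]: ID of the segment whose start point equals the end of i.
    Entries stay at -1 when no such segment exists.
    """
    link = [[-1, -1] for _ in segments]
    for i, (start_i, end_i) in enumerate(segments):
        for j, (start_j, end_j) in enumerate(segments):
            if i == j:
                continue
            if end_j == start_i:
                link[i][0] = j
            if start_j == end_i:
                link[i][1] = j
    return link


def trace_stroke(link, first):
    """Follow link[i][1] from the first segment of a stroke."""
    chain = []
    i = first
    while i != -1 and i not in chain:  # stop on a closed loop
        chain.append(i)
        i = link[i][1]
    return chain


# Two joined segments forming one stroke, plus one isolated segment.
segments = [((0, 0), (10, 0)), ((10, 0), (10, 10)), ((20, 0), (20, 10))]
link = build_links(segments)
stroke_starts = [i for i, entry in enumerate(link) if entry[0] == -1]
```

Segments whose `link[i][0]` is −1 mark the beginning of a stroke, which is the property relied upon in the subsequent search.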

Thus, through the processings 1010 to 1013, a line segment having the value “link [i][0]=−1” is searched for, and its line segment ID number is placed in the stroke number (k). Unless correspondence is established for all the line segments contained in the stroke (k), the overall length of the stroke (k) is added to the distance value as the penalty through the processings 1015 to 1020. In consideration of the case where the voiced sound symbol is inputted with very small strokes, a threshold value is added to the distance value as the penalty in the processing step 1017, instead of the overall length of the stroke (k), when that length is smaller than the threshold value. In the processing 1015, starting from the line segment ID number i=stroke number (k) and referencing “link [i][1]” while tracing all the line segments contained in the stroke, it is checked whether correspondence has been established for all the line segments covered by the stroke (k) by checking whether “cpl_i(i)” or “cpl_d(j)” is “−1”. Similarly, in the processing 1016, the overall length of the stroke is determined by adding together the lengths of all the line segments while tracing them continuously by referencing “link [i][1]”. Although the link information as checked is placed in “link [ ][ ]” at this stage in the present example, the line segment ID numbers contained in the strokes may instead be placed in “link [ ][ ]” in precedence to the element permutating processing 503 shown in FIG. 5. In that case, however, the dictionary capacity increases if the information of “link [ ][ ]” is held for the dictionary patterns. Accordingly, the input pattern may preferably be placed in “link [ ][ ]” in precedence to the element permutating processing, with only the dictionary pattern being set to “link [ ][ ]” in the processing 1006 or 1008 shown in FIG. 10.
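A minimal sketch of the stroke insufficiency penalty follows. The description leaves some room as to whether the penalty is triggered by a single unmatched line segment or only by a stroke none of whose line segments is matched; the sketch assumes the latter reading, consistent with claim 3. The function name, the boolean `matched` list and the numeric values are hypothetical.

```python
def stroke_insufficiency_penalty(link, matched, seg_len, threshold):
    """Penalise strokes for which no segment found a counterpart.

    link[i] = [predecessor ID, successor ID], -1 where absent.
    matched[i] is True when segment i has a correspondence.
    ASSUMPTION: a stroke is penalised only if none of its segments is
    matched. The penalty is the stroke length, raised to `threshold`
    for very short strokes so that small marks are not ignored.
    """
    dist = 0.0
    for first in range(len(link)):
        if link[first][0] != -1:
            continue  # not the first segment of a stroke
        chain = []
        i = first
        while i != -1 and i not in chain:
            chain.append(i)
            i = link[i][1]
        if any(matched[s] for s in chain):
            continue
        stroke_length = sum(seg_len[s] for s in chain)
        dist += max(stroke_length, threshold)
    return dist


# Stroke {0, 1} is partly matched; strokes {2} and {3} are missing.
link = [[-1, 1], [0, -1], [-1, -1], [-1, -1]]
dist = stroke_insufficiency_penalty(
    link, [True, False, False, False], [10.0, 10.0, 2.0, 30.0], threshold=5.0
)
```

In the usage example the short stroke of length 2 contributes the threshold value 5 rather than its own length, which mirrors the handling of a small voiced sound symbol.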

FIG. 11 shows a processing flow of the link check 902 illustrated in FIG. 9. With this processing, a penalty of a proper value is added to the distance value in the case where the link statuses of the elements differ, as exemplified by the patterns shown in FIG. 11. More specifically, a penalty of a maximum value is assigned in the case where one of the dictionary character pattern and the inputted character pattern is given by one continuous stroke while the other is given by two discrete strokes and, additionally, one stroke is connected to a different element, as in the case of the pattern illustrated at a). When the stroke is not connected to a different element, a penalty of a small value should preferably be imposed, because the one stroke mentioned above can be regarded as being written in continuation. A penalty is also assigned in the case where one of the dictionary character pattern and the inputted character pattern extends continuously while the other extends continuously but contains an interposing element of a different species, as in the case of the pattern illustrated at b). In that case, when the interposing element of the different species has a length smaller than a predetermined value, a penalty of a predetermined value is selected, whereas when the length of the interposing element exceeds the predetermined value, the value of the penalty to be imposed should preferably be selected in dependence on the length of the interposing element. On the other hand, when both the dictionary character pattern and the inputted character pattern are of a same connection form, as illustrated at c), no penalty is imposed.
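The three cases a) to c) may be summarized by the following sketch. All constants, the case labels and the linear dependence on the length of the interposing element are assumptions chosen only to make the example concrete; the description specifies no particular values.

```python
# Hypothetical penalty values; none of them is given in the description.
MAX_PENALTY = 100.0    # case a), stroke connected to a different element
SMALL_PENALTY = 5.0    # case a), stroke merely written in continuation
FIXED_PENALTY = 10.0   # case b), short interposing element
LENGTH_LIMIT = 8.0     # case b), boundary between short and long
PER_UNIT = 2.0         # case b), assumed linear weight for long elements


def link_penalty(case, connected_elsewhere=False, interposed_len=0.0):
    """Return the link-check penalty for one pair of compared strokes.

    case: "same" (c), "split" (a: one stroke versus two strokes) or
    "interposed" (b: an extra element of a different species).
    """
    if case == "same":
        return 0.0
    if case == "split":
        return MAX_PENALTY if connected_elsewhere else SMALL_PENALTY
    if case == "interposed":
        if interposed_len < LENGTH_LIMIT:
            return FIXED_PENALTY
        return PER_UNIT * interposed_len
    raise ValueError(f"unknown case: {case!r}")
```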

The present invention resides in the character recognition technology which is capable of recognizing even a character written in an irregular stroke sequence. In this connection, for discriminatively recognizing patterns which are identical except for the stroke sequence or order, the dictionary pattern and the input pattern may be checked as to coincidence of the stroke order information in the link status check processing.

FIG. 12 shows a processing flow of the writing direction check processing 903. The purpose of this processing is to make it possible to recognize the character even when it is inputted in the reverse writing direction. In a processing 1201, the writing directions for all the elements are aligned in terms of “→” and “↓”. When all the elements are horizontal/vertical bars, “↑” is converted to “↓” with “←” being converted to “→”, and the relevant conversion information is stored. An element written obliquely is classified into the vertical bar group or the horizontal bar group and is subsequently converted in the manner mentioned above. In a processing 1202, the inputted/dictionary character pattern matching processing described hereinbefore by reference to FIG. 7A is executed. To this end, the matching may be carried out with all the dictionary patterns or alternatively with only high-rank dictionary pattern candidates. Through processings 1203 to 1206, a penalty is assigned to the elements which do not coincide in the writing direction, i.e., the elements whose writing direction differs from the original writing direction.
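The alignment of the processing 1201 and the subsequent penalty may be sketched as follows. The rule that an oblique element is horizontal when its horizontal extent is at least its vertical extent, the screen coordinate system with Y increasing downward, and the per-element penalty value are all assumptions of this illustration.

```python
def normalize_direction(segment):
    """Align one element to the direction "→" or "↓".

    segment is ((x0, y0), (x1, y1)); Y is ASSUMED to grow downward.
    An oblique element is ASSUMED to be horizontal when |dx| >= |dy|.
    Returns the aligned segment and a flag recording the conversion.
    """
    (x0, y0), (x1, y1) = segment
    dx, dy = x1 - x0, y1 - y0
    horizontal = abs(dx) >= abs(dy)
    was_reversed = dx < 0 if horizontal else dy < 0
    if was_reversed:
        return ((x1, y1), (x0, y0)), True
    return segment, False


def direction_penalty(converted_flags, per_element=3.0):
    """Penalty for elements written against the aligned direction."""
    return per_element * sum(converted_flags)


strokes = [((10, 0), (0, 0)), ((0, 10), (0, 0)), ((0, 0), (3, 10))]
aligned, flags = zip(*(normalize_direction(s) for s in strokes))
```

The stored flags play the role of the conversion information, from which the elements to be penalized after the matching are identified.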

In this connection, before executing the processing 903, a check may be performed as to two respects, i.e., (1) whether the writing directions of the elements of the inputted pattern contain many “↑” and “←”, and (2) whether no hit candidates can be found because of a very large distance value for the candidate of upper rank. The processing 903 may be carried out only when the check mentioned above results in affirmation. In this way, the time consumed by the processing can be saved.
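The preliminary check may be sketched as below. The sketch assumes that both conditions must hold, and the ratio and distance limits are hypothetical values standing in for “many” and “very large”.

```python
def should_run_direction_check(reversed_flags, best_distance,
                               reversed_ratio=0.5, distance_limit=100.0):
    """Decide whether the writing direction check is worth running.

    reversed_flags[i] is True when input element i points "↑" or "←".
    best_distance is the distance value of the top-ranked candidate.
    ASSUMPTIONS: both conditions are required; the limits are examples.
    """
    if not reversed_flags:
        return False
    share = sum(reversed_flags) / len(reversed_flags)
    return share >= reversed_ratio and best_distance > distance_limit
```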

In this way, for the dictionary patterns set as the candidates, the candidate order is reexamined in the ascending order of the distance values in the processing 906 after the stroke or delineation insufficiency check processing, the link check processing, the writing direction check processing and the angle/curve check processing have been carried out.
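The reexamination of the candidate order may be illustrated by the following sketch, in which the penalties from the individual checks are simply summed onto the matching distance. The record layout, the character codes and the additive combination are assumptions of the example.

```python
def rerank(candidates):
    """Sort candidates by matching distance plus accumulated penalties.

    Each candidate is a dict with a character "code", the matching
    distance "dist" and a list "penalties" from the detail checks.
    """
    return sorted(candidates, key=lambda c: c["dist"] + sum(c["penalties"]))


candidates = [
    {"code": "A", "dist": 10.0, "penalties": [0.0, 20.0, 0.0, 0.0]},
    {"code": "B", "dist": 12.0, "penalties": [0.0, 0.0, 5.0, 0.0]},
    {"code": "C", "dist": 25.0, "penalties": []},
]
order = [c["code"] for c in rerank(candidates)]
```

Here candidate "A" has the smallest matching distance but falls to the last rank once its link penalty is taken into account.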

Thus, on the basis of the results of the detail recognition processing 406 illustrated in FIG. 4, the distance values are outputted in the ascending order as the result of recognition (processing 407).

INDUSTRIAL APPLICABILITY

As described in the foregoing, according to the present invention, there can be provided a character recognition apparatus which can recognize discriminatively even the characters modified due to rough writing, continuous writing and/or irregular stroke sequence with the load for the recognition processing being reduced.

What is claimed is:
 1. A character recognizing apparatus, comprising an input unit for allowing a handwritten character to be inputted to thereby output a coordinate-points string; a dictionary for storing therein a plurality of character codes and a character pattern corresponding to each of the character codes; a decomposing unit for decomposing the coordinate-points string outputted from said input unit into a plurality of elements which constitute an input character, said decomposing unit generating a plurality of line segments from the coordinate-point string; and a matching unit for obtaining a distance value between the line segments thus generated and line segments of a character pattern stored in said dictionary in a descending order of length of the line segments thus generated for combinations of elements of an input character pattern and elements of each of character patterns stored in said dictionary, and selecting a distance value which becomes a smallest distance between elements irrespective of an input order of the elements.
 2. A character recognizing apparatus, comprising: an input unit for allowing a handwritten character to be inputted to thereby output a coordinate-points string; a dictionary for storing therein a plurality of character codes and a character pattern corresponding to each of the character codes; a decomposing unit for decomposing the coordinate-points string outputted from said input unit into a plurality of elements which constitute an input character; and a matching unit for selecting a distance value which becomes a smallest distance between elements irrespective of an input order of the elements for combinations of the elements of an input character pattern and elements of each of character patterns stored in said dictionary, wherein said decomposing unit traces orderly the coordinate-points string to thereby decompose the coordinate-points string into line segments each having end points at which at least one of X- or Y-coordinates has a maximum or minimum value, compares length of a line segment thus decomposed with length of a stroke constituted by interconnecting the coordinate-points string, and divides the line segment thus compared into two line segments each having an end coinciding with a mid point of the line segment thus compared, on the basis of result of said comparison.
 3. A character recognizing apparatus, comprising: an input unit for allowing a handwritten character to be inputted to thereby output a coordinate-points string; a dictionary for storing therein a plurality of character codes and a character pattern corresponding to each of the character codes; a decomposing unit for decomposing the coordinate-points string outputted from said input unit into a plurality of elements which constitute an input character; and a matching unit for selecting a distance value which becomes a smallest distance between elements irrespective of an input order of the elements for combinations of the elements of an input character pattern and elements of each of character patterns stored in said dictionary, wherein after establishing correspondence relationships between broken-line elements of the input character pattern and character patterns stored in said dictionary, said matching unit assigns a greater distance value in a case where there exists no broken-line element corresponding to any broken-line element of strokes constituted by the broken-line elements of the input character or the dictionary pattern, when compared with a case where there exists a broken-line element corresponding to some one of broken-line elements of strokes.